Reinforcement Learning using Kohonen Feature Map Probabilistic Associative Memory based on Weights Distribution

Author

  • Yuko Osana
Abstract

Reinforcement learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward (Sutton & Barto, 1998). Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Temporal Difference (TD) learning is one such algorithm. It combines ideas from Monte Carlo methods and dynamic programming (DP). TD resembles a Monte Carlo method because it learns by sampling the environment according to some policy. It is related to dynamic programming because it updates its current estimate based on previously learned estimates. The actor-critic method (Witten, 1977) is based on TD learning and consists of two parts: (1) an actor, which selects the action, and (2) a critic, which evaluates the action and the state. Neural networks, meanwhile, are drawing much attention as a way to realize flexible information processing. They are engineered imitations of the groups of neurons found in the brains of living creatures. One of their most important features is that they can acquire information-processing abilities through learning. Several reinforcement learning methods that combine the flexible information processing of neural networks with the adaptive learning of reinforcement learning have been proposed (Shibata et al., 2001; Ishii et al., 2005; Shimizu and Osana, 2008). In this research, we propose a reinforcement learning method using the Kohonen Feature Map Probabilistic Associative Memory based on Weights Distribution (KFMPAM-WD) (Osana, 2009). The proposed method is based on the actor-critic method, and the actor is realized by the KFMPAM-WD.
The KFMPAM-WD is based on the self-organizing feature map (Kohonen, 1994), and it can realize successive learning and one-to-many associations. The proposed method uses these properties so that learning can take place while the task is being carried out.
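
As a rough illustration of the actor-critic scheme outlined in the abstract, the sketch below trains a tabular critic and a softmax actor with the TD error. It is not the paper's method: the KFMPAM-WD actor is not described here, so a plain preference table stands in for it, and the chain environment, the function names (`softmax`, `td_error`, `train_actor_critic`) and all parameter values are assumptions made for this example only.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def td_error(reward, value_next, value_now, gamma):
    """TD error: bootstrapped target minus the current estimate."""
    return reward + gamma * value_next - value_now


def train_actor_critic(n_states=5, episodes=500, alpha=0.1, beta=0.1,
                       gamma=0.95, seed=0):
    """Tabular actor-critic on a hypothetical chain: start at state 0,
    reward 1.0 on reaching the last (terminal) state, 0.0 otherwise."""
    rng = random.Random(seed)
    values = [0.0] * n_states                      # critic: state values
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actor: [left, right]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                       # cap on episode length
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            v_next = 0.0 if done else values[next_state]
            delta = td_error(reward, v_next, values[state], gamma)
            values[state] += alpha * delta         # critic update
            prefs[state][action] += beta * delta   # actor update
            if done:
                break
            state = next_state
    return values, prefs


if __name__ == "__main__":
    values, prefs = train_actor_critic()
    print("state values:", [round(v, 3) for v in values])
    print("action preferences:", prefs)
```

The critic's estimate for each state is moved toward the reward plus the discounted estimate of the next state, which is the bootstrapping from previously learned estimates that the abstract attributes to TD learning. The same TD error raises or lowers the actor's preference for the action just taken.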

Similar resources

Variable-Sized Kohonen Feature Map Probabilistic Associative Memory

This paper presents a Variable-sized Kohonen Feature Map Probabilistic Associative Memory for Sequential Patterns (VKFMPAM-SP) which can realize successive (additional) learning for sequential patterns. In the proposed model, since the connection weights of the neurons which are centers of the area corresponding to the stored data are fixed, new patterns can be memorized without destroying memo...

Kohonen Feature Map Associative Memory with Area Representation

In this paper, we propose a Kohonen feature map associative memory with area representation for sequential patterns. This model is based on the Kohonen feature map associative memory with area representation and the Kohonen feature map associative memory for temporal sequences. The proposed model can learn sequential patterns successively, and has robustness for damaged neurons. We carried out ...

RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features

Principal aim of a search engine is to provide the sorted results according to user’s requirements. To achieve this aim, it employs ranking methods to rank the web documents based on their significance and relevance to user query. The novelty of this paper is to provide user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...

The Lamstar Neural Network: a Brief Review

This paper reviews the principles and several different applications of the LAMSTAR (Large Memory Storage and Retrieval) Neural Network. The LAMSTAR was specifically developed for application to problems involving very large memory that relates to many different categories (attributes), where some of the data is exact while other data are fuzzy and where, for a given problem, some data categori...

Publication date: 2012